Constrained Density Estimation via Optimal Transport

Hu, Yinan, Tabak, Esteban

arXiv.org Machine Learning

The classical optimal transport (OT) problem seeks the map that moves mass from a source to a target measure while minimizing a prescribed cost function. The objective can be formalized in either Monge's formulation [12] or Kantorovich's [10], a convex relaxation of the former that considers transport plans instead of deterministic maps. These foundational formulations have wide-ranging applications, including to economics [7] and machine learning [14]. In many practical scenarios, the source measure is known or readily inferable from empirical data, but the target measure is not explicitly specified; instead, it is only constrained by practical requirements or expert knowledge. For example, when applying Monge's formulation to transportation problems, the placement of the mass in the target region may be constrained to lie entirely beyond a certain boundary or within a particular region, rather than specified by a precise location for each fraction of the total mass. Similarly, in economic applications, supply and demand may be subject to constraints such as maximal amounts available or minimal amounts required, rather than dictated by precise marginal distributions.
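In the discrete setting, Kantorovich's relaxation is a linear program over transport plans with prescribed marginals. A minimal sketch (the function name and the tiny two-point instance are illustrative, using `scipy.optimize.linprog`; this is the generic LP, not the paper's constrained variant):

```python
import numpy as np
from scipy.optimize import linprog

def kantorovich_plan(mu, nu, C):
    """Discrete Kantorovich OT: min <C, P> over plans P >= 0 whose
    row sums equal mu (source) and column sums equal nu (target)."""
    m, n = C.shape
    A_eq = np.zeros((m + n, m * n))
    for i in range(m):
        A_eq[i, i * n:(i + 1) * n] = 1.0   # sum_j P[i, j] = mu[i]
    for j in range(n):
        A_eq[m + j, j::n] = 1.0            # sum_i P[i, j] = nu[j]
    b_eq = np.concatenate([mu, nu])
    res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.x.reshape(m, n), res.fun

# Two equal point masses; cost 0 to stay, 1 to swap.
mu = np.array([0.5, 0.5])
nu = np.array([0.5, 0.5])
C = np.array([[0.0, 1.0],
              [1.0, 0.0]])
P, cost = kantorovich_plan(mu, nu, C)  # optimal plan keeps mass in place
```

The constrained setting the paper studies would replace the fixed target marginal `nu` with inequality constraints on the column sums (e.g. minimal demand or maximal supply per location).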


Appendix A: Proofs for Section 3

Neural Information Processing Systems

The set of solutions to Eq. (3.4) is a line parallel to the subspace E.

A.2 Proof of Lemma 2. For every θ ∈ E we have Φθ = e, so the auxiliary algorithm (A.1) can be rewritten in vector form. The Bellman operator H is indifferent to shifts along E, i.e., H(Q + x) = H(Q) for all x ∈ E, so the finite-time analyses in the literature cannot be applied directly to establish convergence of the iterates to a fixed point; at best, the iterates converge to a fixed equivalence class. Lemma 4.a) and Lemma 4.b) (monotonicity of the infimal convolution) supply the key comparison properties, and Proposition 2 requires M to be L-smooth. The remaining analysis studies the iterates generated by a stochastic approximation scheme for solving the fixed-point equation, under Assumption 4 on the function H and its stochastic sample Ĥ, treating the decreasing-stepsize case first.
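The analysis above concerns a stochastic approximation scheme with decreasing stepsizes. As a generic point of reference, a Robbins-Monro toy (the root-finding target `h_noisy` and the mean-estimation instance are illustrative; this is not the appendix's operator H or its equivalence-class setting):

```python
import random

def robbins_monro(h_noisy, x0, a0=1.0, iters=20000, seed=0):
    """Generic Robbins-Monro iteration: find x* with h(x*) = 0 from noisy
    evaluations h(x) + noise, with decreasing stepsizes a_t = a0 / (t + 1)."""
    rng = random.Random(seed)
    x = x0
    for t in range(iters):
        x -= (a0 / (t + 1)) * h_noisy(x, rng)
    return x

# Toy instance: h(x) = x - 2, observed with Gaussian noise, so x* = 2.
x_star = robbins_monro(lambda x, rng: (x - 2.0) + rng.gauss(0.0, 0.1), x0=0.0)
```

The stepsizes satisfy the usual conditions (sum divergent, sum of squares finite), which is what decreasing-stepsize convergence arguments of this kind rely on.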


Cost-Sensitive Conformal Training with Provably Controllable Learning Bounds

Jia, Xuesong, Shi, Yuanjie, Liu, Ziquan, Xu, Yi, Yan, Yan

arXiv.org Machine Learning

Conformal prediction (CP) is a general framework for quantifying the predictive uncertainty of machine learning models: it outputs a set-valued prediction guaranteed to include the true label with a valid probability. To sharpen the uncertainty measured by CP, conformal training methods minimize the size of the prediction sets, typically by replacing the indicator function with a surrogate such as the sigmoid or the Gaussian error function. However, these surrogates do not admit a uniform error bound relative to the indicator function, leading to uncontrollable learning bounds. In this paper, we propose a simple cost-sensitive conformal training algorithm that does not rely on the indicator-approximation mechanism. Specifically, we show theoretically that the expected size of the prediction sets is upper bounded by the expected rank of the true labels. Building on this, we develop a rank-weighting strategy that assigns each data sample a weight based on the rank of its true label. Our analysis provably demonstrates the tightness between the proposed weighted objective and the expected size of the conformal prediction sets. Extensive experiments verify our theoretical insights and show superior empirical performance over other conformal training methods in terms of predictive efficiency, with a 21.38% reduction in average prediction set size.
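For readers new to CP, a minimal split-conformal sketch (illustrative only: this is the standard calibration recipe with a 1 - p nonconformity score and synthetic data, not the paper's cost-sensitive training algorithm):

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Calibrate a score threshold so prediction sets cover the true
    label with probability ~ 1 - alpha (finite-sample corrected)."""
    # Nonconformity score: 1 minus the probability of the true label.
    scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, level, method="higher")

def prediction_set(test_probs, q):
    # Keep every label whose nonconformity score is within the threshold.
    return np.where(1.0 - test_probs <= q)[0]

rng = np.random.default_rng(0)
cal_probs = rng.dirichlet(np.ones(3), size=200)
cal_labels = cal_probs.argmax(axis=1)  # toy calibration labels
q = conformal_threshold(cal_probs, cal_labels, alpha=0.1)
S = prediction_set(np.array([0.7, 0.2, 0.1]), q)  # confident case: small set
```

The paper's observation is that training-time objectives built on the set size |S| involve an indicator over scores; weighting samples by the rank of the true label avoids smoothing that indicator with a sigmoid or erf surrogate.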






Inexact Augmented Lagrangian Methods for Conic Programs: Quadratic Growth and Linear Convergence

Neural Information Processing Systems

Under the quadratic growth assumption, it is known that the dual iterates and the Karush-Kuhn-Tucker (KKT) residuals of augmented Lagrangian methods (ALMs) applied to semidefinite programs (SDPs) converge linearly. In contrast, the convergence rate of the primal iterates has remained elusive.
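As background on the iteration being analyzed, each outer ALM step minimizes the augmented Lagrangian in the primal variable and then updates the multiplier. A toy sketch on an equality-constrained QP (illustrative only; the paper's setting is conic/SDP, where the inner problem is not a single linear solve):

```python
import numpy as np

def alm_qp(Q, c, A, b, rho=10.0, iters=50):
    """Augmented Lagrangian method for min 0.5 x'Qx + c'x  s.t. Ax = b.
    L_rho(x, y) = f(x) + y'(Ax - b) + (rho/2)||Ax - b||^2."""
    x = np.zeros(Q.shape[0])
    y = np.zeros(A.shape[0])
    for _ in range(iters):
        # Inner minimization of L_rho in x is a linear solve for this QP.
        H = Q + rho * A.T @ A
        g = c + A.T @ y - rho * A.T @ b
        x = np.linalg.solve(H, -g)
        # Multiplier (dual) update driven by the constraint residual.
        y = y + rho * (A @ x - b)
    return x, y

# min ||x||^2 / 2 subject to x1 + x2 = 1  ->  x* = (0.5, 0.5), y* = -0.5.
Q = np.eye(2)
c = np.zeros(2)
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
x, y = alm_qp(Q, c, A, b)
```

In this strongly convex toy the dual iterates contract geometrically; the paper's question is the analogous rate for the primal iterates when only quadratic growth, not strong convexity, is available.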